1 - 20 of 39,305
1.
Mikrochim Acta ; 191(5): 293, 2024 05 01.
Article En | MEDLINE | ID: mdl-38691169

To address the need for facile, rapid detection of pathogens in water supplies, a fluorescent sensing array platform based on antibiotic-stabilized metal nanoclusters was developed for the multiplex detection of pathogens. Using five common antibiotics, eight different nanoclusters (NCs) were synthesized including ampicillin stabilized copper NCs, cefepime stabilized gold and copper NCs, kanamycin stabilized gold and copper NCs, lysozyme stabilized gold NCs, and vancomycin stabilized gold/silver and copper NCs. Based on the different interaction of each NC with the bacteria strains, unique patterns were generated. Various machine learning algorithms were employed for pattern discernment, among which the artificial neural networks proved to have the highest performance, with an accuracy of 100%. The developed prediction model performed well on an independent test dataset and on real samples gathered from drinking water, tap water and the Anzali Lagoon water, with prediction accuracy of 96.88% and 95.14%, respectively. This work demonstrates how generic antibiotics can be implemented for NC synthesis and used as recognition elements for pathogen detection. Furthermore, it displays how merging machine learning techniques can elevate sensitivity of analytical devices.


Anti-Bacterial Agents , Copper , Gold , Metal Nanoparticles , Silver , Metal Nanoparticles/chemistry , Anti-Bacterial Agents/analysis , Anti-Bacterial Agents/chemistry , Gold/chemistry , Copper/chemistry , Silver/chemistry , Drinking Water/microbiology , Drinking Water/analysis , Neural Networks, Computer , Spectrometry, Fluorescence/methods , Machine Learning , Bacteria/isolation & purification , Fluorescent Dyes/chemistry , Vancomycin/chemistry , Water Microbiology , Kanamycin/analysis
2.
Aging Clin Exp Res ; 36(1): 108, 2024 May 08.
Article En | MEDLINE | ID: mdl-38717552

INTRODUCTION: Wrist-worn activity monitors have seen widespread adoption in recent times, particularly in young and sport-oriented cohorts, while their usage among older adults has remained relatively low. The main limitation is the lack of medical insight that current mainstream activity trackers can provide to older subjects. One of the most important research areas currently under investigation is the possibility of extrapolating clinical information from these wearable devices. METHODS: This study asks whether accelerometry data collected for 7 days in free-living environments using a consumer-based wristband device, in conjunction with data-driven machine learning algorithms, can predict hand grip strength and possible conditions categorized by hand grip strength in a general population of middle-aged and older adults. RESULTS: The regression analysis shows that the developed models perform notably better than a simple mean-predicting dummy regressor. While the improvement in absolute terms may appear modest, the mean absolute error (6.32 kg for males and 4.53 kg for females) falls within the range considered sufficiently accurate for grip strength estimation. The classification models, in turn, perform well in categorizing individuals as frail/pre-frail or healthy, depending on the T-score levels applied to define frailty/pre-frailty. While cut-off values for frailty vary, the results suggest that the models can moderately detect characteristics associated with frailty (AUC-ROC: 0.70 for males and 0.76 for females) and viably detect characteristics associated with frailty/pre-frailty (AUC-ROC: 0.86 for males and 0.87 for females). CONCLUSIONS: The results of this study can enable the adoption of wearable devices as an efficient tool for clinical assessment in older adults with multimorbidities, improving and advancing integrated care, diagnosis and early screening of a number of widespread diseases.
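The comparison against a mean-predicting dummy regressor described in this abstract can be illustrated with a minimal sketch. The grip-strength values and the `model_pred` list below are hypothetical placeholders, not study data, and the function names are chosen here for illustration only.

```python
from statistics import mean

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))

def dummy_mean_predictions(y_train, n):
    """Baseline that predicts the training-set mean for every test subject."""
    baseline = mean(y_train)
    return [baseline] * n

# Hypothetical grip-strength values in kg (illustrative only, not study data).
y_train = [30.0, 40.0, 50.0]
y_test = [32.0, 44.0, 47.0]
model_pred = [34.0, 41.0, 45.0]  # stand-in for the output of a fitted model

baseline_mae = mean_absolute_error(
    y_test, dummy_mean_predictions(y_train, len(y_test))
)
model_mae = mean_absolute_error(y_test, model_pred)
```

A model is only informative if `model_mae` is clearly below `baseline_mae`, which is the kind of comparison the abstract reports.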


Accelerometry , Hand Strength , Wrist , Humans , Hand Strength/physiology , Male , Female , Aged , Accelerometry/instrumentation , Accelerometry/methods , Middle Aged , Wrist/physiology , Wearable Electronic Devices , Aged, 80 and over , Machine Learning
3.
Nat Commun ; 15(1): 3860, 2024 May 08.
Article En | MEDLINE | ID: mdl-38719824

Dual blocker therapy (DBT) offers greater antitumor benefit than monotherapy. Yet few effective biomarkers have been developed to monitor the therapy response. Herein, we perform longitudinal plasma proteome profiling of DBT, covering 113 longitudinal samples from 22 patients who received anti-PD1 and anti-CTLA4 DBT. The results show that the immune response and cholesterol metabolism are upregulated after the first DBT cycle. Notably, cholesterol metabolism is activated in the disease non-progressive (DNP) group during the therapy. Correspondingly, the clinical indicators prealbumin (PA), free triiodothyronine (FT3) and triiodothyronine (T3) show significantly positive associations with cholesterol metabolism. Furthermore, by integrating proteomic and radiological approaches, we observe that high-density lipoprotein partial remodeling is activated in the DNP group and identify a candidate biomarker, APOC3, that can reflect DBT response. In addition, we establish a machine learning model to predict the DBT response, and the model performance is validated in an independent cohort with a balanced accuracy of 0.96. Thus, the plasma proteome profiling strategy evaluates the alteration of cholesterol metabolism and identifies a panel of biomarkers in DBT.
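Balanced accuracy, the metric reported for the independent cohort, is the mean of the per-class recalls, which makes it less sensitive to class imbalance than plain accuracy. The sketch below uses hypothetical labels (1 = non-progressive, 0 = progressive) that are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; for two classes this equals
    (sensitivity + specificity) / 2."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        correct = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)

# Hypothetical labels: 4 non-progressive (1) and 8 progressive (0) patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

score = balanced_accuracy(y_true, y_pred)
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case one missed minority-class patient lowers balanced accuracy more than it lowers plain accuracy, which is why the metric suits small, imbalanced cohorts.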


Cholesterol , Proteome , Humans , Cholesterol/blood , Cholesterol/metabolism , Proteome/metabolism , Female , Male , Middle Aged , CTLA-4 Antigen/antagonists & inhibitors , CTLA-4 Antigen/metabolism , CTLA-4 Antigen/blood , Programmed Cell Death 1 Receptor/antagonists & inhibitors , Programmed Cell Death 1 Receptor/metabolism , Programmed Cell Death 1 Receptor/blood , Biomarkers/blood , Aged , Triiodothyronine/blood , Machine Learning , Immune Checkpoint Inhibitors/therapeutic use , Immune Checkpoint Inhibitors/pharmacology , Neoplasms/drug therapy , Neoplasms/blood , Neoplasms/metabolism , Proteomics/methods
4.
Sci Rep ; 14(1): 10604, 2024 05 08.
Article En | MEDLINE | ID: mdl-38719879

Neoplasm is an umbrella term used to describe either benign or malignant conditions. The correlations between socioeconomic and environmental factors and the occurrence of new-onset neoplasms have already been demonstrated in a body of research. Nevertheless, few studies have specifically dealt with the nature of the relationship, the significance of risk factors, and their geographic variation, particularly in low- and middle-income communities. This study, thus, set out to (1) analyze spatiotemporal variations of the age-adjusted incidence rate (AAIR) of neoplasms in Iran throughout five time periods, (2) investigate relationships between a collection of environmental and socioeconomic indicators and the AAIR of neoplasms all over the country, and (3) evaluate geographical alterations in their relative importance. Our cross-sectional study design was based on county-level data from 2010 to 2020. AAIR of neoplasms data was acquired from the Institute for Health Metrics and Evaluation (IHME). HotSpot analyses and Anselin Local Moran's I indices were deployed to precisely identify high- and low-risk clusters of the AAIR of neoplasms. Multiscale geographically weighted regression (MGWR) analysis was performed to evaluate the association between each explanatory variable and the AAIR of neoplasms. Utilizing random forests (RF), we also examined the relationships between environmental (e.g., UV index and PM2.5 concentration) and socioeconomic (e.g., Gini coefficient and literacy rate) factors and the AAIR of neoplasms. The AAIR of neoplasms displayed a significant increasing trend over the study period. According to the MGWR, the only factor that significantly varied spatially and was associated with the AAIR of neoplasms in Iran was the UV index. The RF model showed good accuracy on both training and testing data, with R2 values greater than 0.91 and 0.92, respectively. UV index and Gini coefficient ranked as the most important variables in the prediction of the AAIR of neoplasms, based on the relative influence of each variable. More research using machine learning approaches that take advantage of considering all possible determinants is required to assess the outcomes of health strategies and properly formulate policy planning.


Machine Learning , Neoplasms , Socioeconomic Factors , Humans , Iran/epidemiology , Cross-Sectional Studies , Incidence , Neoplasms/epidemiology , Neoplasms/etiology , Geographic Information Systems , Risk Factors , Female , Male , Environmental Exposure/adverse effects
5.
Sci Rep ; 14(1): 10569, 2024 05 08.
Article En | MEDLINE | ID: mdl-38719918

Within the medical field of human assisted reproductive technology, a method for interpretable, non-invasive, and objective oocyte evaluation is lacking. To address this clinical gap, a workflow utilizing machine learning techniques has been developed involving automatic multiclass segmentation of two-dimensional images, morphometric analysis, and prediction of developmental outcomes of mature denuded oocytes based on feature extraction and clinical variables. Two separate models have been developed for this purpose: a model to perform multiclass segmentation, and a classifier model to classify oocytes as likely or unlikely to develop into a blastocyst (Day 5-7 embryo). The segmentation model is highly accurate at segmenting the oocyte, ensuring high-quality segmented images (masks) are utilized as inputs for the classifier model (mask model). The mask model displayed an area under the curve (AUC) of 0.63, a sensitivity of 0.51, and a specificity of 0.66 on the test set. The AUC fell to 0.57 when features extracted from the ooplasm were removed, suggesting the ooplasm holds the information most pertinent to oocyte developmental competence. The mask model was further compared to a deep learning model, which also utilized the segmented images as inputs. The performance of both models combined in an ensemble model was evaluated, showing an improvement (AUC 0.67) compared to either model alone. The results of this study indicate that direct assessments of the oocyte are warranted, providing the first objective insights into key features for developmental competence, a step above the current standard of care, which relies solely on oocyte age as a proxy for quality.
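The three metrics quoted for the mask model (AUC, sensitivity, specificity) can be computed from predicted scores as in the following sketch. The labels and scores are hypothetical, and the 0.5 decision threshold is an assumption made here, since the abstract does not state the cut-off used.

```python
def roc_auc(y_true, scores):
    """Probability that a random positive scores higher than a random
    negative, counting ties as half (rank-based AUC)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Recall on positives and on negatives at an assumed threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 0)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return tp / n_pos, tn / n_neg

# Hypothetical data: 1 = developed into a blastocyst, 0 = did not.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

auc = roc_auc(y_true, scores)
sens, spec = sensitivity_specificity(y_true, scores)
```

AUC is threshold-free, whereas sensitivity and specificity change with the chosen cut-off, so the three numbers answer slightly different questions about the classifier.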


Blastocyst , Machine Learning , Oocytes , Humans , Blastocyst/cytology , Blastocyst/physiology , Oocytes/cytology , Female , Embryonic Development , Adult , Fertilization in Vitro/methods , Image Processing, Computer-Assisted/methods
6.
Sci Rep ; 14(1): 10598, 2024 05 08.
Article En | MEDLINE | ID: mdl-38719940

A popular and widely suggested measure for assessing unilateral hand motor skills in stroke patients is the box and block test (BBT). Our study aimed to create an augmented reality enhanced version of the BBT (AR-BBT) and evaluate its correlation to the original BBT for stroke patients. Following G-power analysis, clinical examination, and inclusion-exclusion criteria, 31 stroke patients were included in this study. AR-BBT was developed using the Open Source Computer Vision Library (OpenCV). MediaPipe's hand tracking library uses a palm detection model and a hand landmark machine learning model to detect and track hands. A computer and a depth camera were employed in the clinical evaluation of AR-BBT following the principles of traditional BBT. A strong, statistically significant positive correlation was found between the number of blocks moved in the BBT and the AR-BBT on the hemiplegic side (Pearson correlation = 0.918, p = 0.000008). The conventional BBT is currently the preferred assessment method. However, our approach offers an advantage, as it suggests that an AR-BBT solution could remotely monitor the assessment of a home-based rehabilitation program and provide additional hand kinematic information for hand dexterities in AR environment conditions. Furthermore, it requires minimal hardware.
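The agreement between the two tests is summarized by a Pearson correlation coefficient. A minimal sketch of that computation follows; the block counts are hypothetical and are not the study's measurements.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the
    standard deviations (normalization factors cancel)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical blocks-moved counts for five patients (illustrative only).
bbt = [10, 20, 30, 40, 50]
ar_bbt = [12, 18, 33, 41, 47]

r = pearson_r(bbt, ar_bbt)
```

A value of `r` close to 1 indicates that patients who move more blocks in the conventional test also move more in the augmented version, but it does not by itself show that the two tests give the same absolute counts.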


Augmented Reality , Hand , Machine Learning , Stroke Rehabilitation , Stroke , Humans , Male , Female , Middle Aged , Stroke/physiopathology , Aged , Hand/physiopathology , Hand/physiology , Stroke Rehabilitation/methods , Motor Skills/physiology , Adult
7.
BMC Bioinformatics ; 25(1): 181, 2024 May 08.
Article En | MEDLINE | ID: mdl-38720247

BACKGROUND: RNA sequencing combined with machine learning techniques has provided a modern approach to the molecular classification of cancer. Class predictors, reflecting the disease class, can be constructed for known tissue types using the gene expression measurements extracted from cancer patients. One challenge of current cancer predictors is that they often have suboptimal performance estimates when integrating molecular datasets generated from different labs. Often, the data are of variable quality, are procured differently, and contain unwanted noise that hampers the ability of a predictive model to extract useful information. Data preprocessing methods can be applied in attempts to reduce these systematic variations and harmonize the datasets before they are used to build a machine learning model for resolving tissue of origin. RESULTS: We aimed to investigate the impact of data preprocessing steps (normalization, batch effect correction, and data scaling) through trial and comparison. Our goal was to improve the cross-study predictions of tissue of origin for common cancers on large-scale RNA-Seq datasets derived from thousands of patients and over a dozen tumor types. The results showed that the choice of data preprocessing operations affected the performance of the associated classifier models constructed for tissue of origin predictions in cancer. CONCLUSION: By using TCGA as a training set and applying data preprocessing methods, we demonstrated that batch effect correction improved performance measured by weighted F1-score in resolving tissue of origin against an independent GTEx test dataset. On the other hand, the use of data preprocessing operations worsened classification performance when the independent test dataset was aggregated from separate studies in ICGC and GEO. Therefore, based on our findings with these publicly available large-scale RNA-Seq datasets, the application of data preprocessing techniques to a machine learning pipeline is not always appropriate.
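The weighted F1-score used to compare preprocessing choices averages the per-class F1 values, weighting each class by its number of true samples. The sketch below shows the calculation on hypothetical tissue labels; it is not the authors' evaluation code.

```python
def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    total = len(y_true)
    score = 0.0
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        support = tp + fn  # number of true samples of this class
        score += f1 * support / total
    return score

# Hypothetical tissue-of-origin labels (illustrative only).
y_true = ["lung", "lung", "lung", "breast", "breast", "colon"]
y_pred = ["lung", "lung", "breast", "breast", "breast", "colon"]

f1_weighted = weighted_f1(y_true, y_pred)
```

Because common tumor types carry more weight, a change in this score after batch correction mostly reflects how the well-represented tissues are classified.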


Machine Learning , Neoplasms , RNA-Seq , Humans , RNA-Seq/methods , Neoplasms/genetics , Transcriptome/genetics , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods , Computational Biology/methods
8.
BMC Pregnancy Childbirth ; 24(1): 351, 2024 May 08.
Article En | MEDLINE | ID: mdl-38720272

BACKGROUND: Plasma microRNAs act as biomarkers for predicting and diagnosing diseases. Reliable non-invasive biomarkers for biochemical pregnancy loss (BPL) have not been established. We aimed to analyze the dynamic microRNA profiles during the peri-implantation period and investigate whether plasma microRNAs could be non-invasive biomarkers predicting BPL. METHODS: In this study, we collected plasma samples from patients undergoing embryo transfer (ET) on ET day (ET0), 11 days after ET (ET11), and 14 days after ET (ET14). Patients were divided into the NP (negative pregnancy), BPL (biochemical pregnancy loss), and CP (clinical pregnancy) groups according to serum hCG levels at days 11-14 and ultrasound at days 28-35 following ET. MicroRNA profiles at different time-points were detected by miRNA-sequencing. We analyzed plasma microRNA signatures for BPL at the peri-implantation stage, characterized the dynamic microRNA changes during the implantation period, constructed a microRNA co-expression network, and established predictive models for BPL. Finally, the sequencing results were confirmed by Taqman RT-qPCR. RESULTS: BPL patients have distinct plasma microRNA profiles compared to CP patients at multiple time-points during the peri-implantation period. Machine learning models revealed that plasma microRNAs could predict BPL. RT-qPCR confirmed that miR-181a-2-3p, miR-9-5p, miR-150-3p, miR-150-5p, miR-98-5p, and miR-363-3p were significantly differentially expressed between patients with different reproductive outcomes. CONCLUSION: Our study highlights the non-invasive value of plasma microRNAs in predicting BPL.


Abortion, Spontaneous , Biomarkers , Embryo Transfer , MicroRNAs , Humans , Female , Pregnancy , MicroRNAs/blood , Adult , Biomarkers/blood , Abortion, Spontaneous/blood , Embryo Implantation , Machine Learning
9.
Lipids Health Dis ; 23(1): 137, 2024 May 08.
Article En | MEDLINE | ID: mdl-38720280

BACKGROUND: Evidence suggests that hepatocyte mitochondrial dysfunction leads to abnormal lipid metabolism, redox imbalance, and programmed cell death, driving the onset and progression of non-alcoholic steatohepatitis (NASH). Identifying hub mitochondrial genes linked to NASH may unveil potential therapeutic targets. METHODS: Mitochondrial hub genes implicated in NASH were identified via analysis using 134 algorithms. RESULTS: The Random Forest algorithm (RF), the most effective among the 134 algorithms, identified three genes: Aldo-keto reductase family 1 member B10 (AKR1B10), thymidylate synthase (TYMS), and triggering receptor expressed in myeloid cell 2 (TREM2). They were upregulated and positively associated with genes promoting inflammation, genes involved in lipid synthesis, fibrosis, and nonalcoholic steatohepatitis activity scores in patients with NASH. Moreover, using these three genes, patients with NASH were accurately categorized into cluster 1, exhibiting heightened disease severity, and cluster 2, distinguished by milder disease activity. CONCLUSION: These three genes are pivotal mitochondrial genes implicated in NASH progression.


Algorithms , Machine Learning , Non-alcoholic Fatty Liver Disease , Non-alcoholic Fatty Liver Disease/genetics , Non-alcoholic Fatty Liver Disease/pathology , Humans , Mitochondria/genetics , Mitochondria/metabolism , Lipid Metabolism/genetics , Aldo-Keto Reductases/genetics , Aldo-Keto Reductases/metabolism , Genes, Mitochondrial
10.
Respir Res ; 25(1): 199, 2024 May 08.
Article En | MEDLINE | ID: mdl-38720331

BACKGROUND: Bronchopulmonary dysplasia-associated pulmonary hypertension (BPD-PH) remains a devastating clinical complication seriously affecting the therapeutic outcome of preterm infants. Hence, early prevention and timely diagnosis prior to pathological change are key to reducing morbidity and improving prognosis. Our primary objective was to utilize machine learning techniques to build predictive models that could accurately identify BPD infants at risk of developing PH. METHODS: The data utilized in this study were collected from neonatology departments of four tertiary-level hospitals in China. To address the issue of imbalanced data, the synthetic minority over-sampling technique (SMOTE), an oversampling algorithm, was applied to improve the model. RESULTS: Seven hundred sixty-one clinical records were collected in our study. Following data pre-processing and feature selection, 5 of the 46 features were used to build models, including duration of invasive respiratory support (days), the severity of BPD, ventilator-associated pneumonia, pulmonary hemorrhage, and early-onset PH. Four machine learning models were applied to predictive learning, and after comprehensive comparison one model was ultimately selected. The model achieved 93.8% sensitivity, 85.0% accuracy, and 0.933 AUC. A score of the logistic regression formula greater than 0 was identified as a warning sign of BPD-PH. CONCLUSIONS: We comprehensively compared different machine learning models and ultimately obtained a good prognosis model which was sufficient to support pediatric clinicians to make early diagnosis and formulate a better treatment plan for pediatric patients with BPD-PH.
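The core idea of SMOTE is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The following is a simplified, standard-library sketch of that idea; the function name, the two-feature points, and the choices `k=2` and `seed=0` are assumptions for illustration, and a real analysis would normally rely on an established implementation.

```python
import random
from math import dist

def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points on line segments joining a random
    minority sample to one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, sorted by distance to base.
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: dist(base, minority[j]),
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (q - b) for b, q in zip(base, neighbour))
        )
    return synthetic

# Hypothetical minority-class feature vectors (illustrative only).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
new_points = smote_sketch(minority, n_new=6)
```

Oversampling of this kind should be applied to the training data only, so that synthetic points never leak into the data used to estimate performance.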


Bronchopulmonary Dysplasia , Hypertension, Pulmonary , Machine Learning , Humans , Bronchopulmonary Dysplasia/diagnosis , Infant, Newborn , Hypertension, Pulmonary/diagnosis , Male , Female , Retrospective Studies , Infant, Extremely Premature , Infant, Premature
11.
Cancer Imaging ; 24(1): 59, 2024 May 08.
Article En | MEDLINE | ID: mdl-38720384

BACKGROUND: To develop a magnetic resonance imaging (MRI)-based radiomics signature for evaluating the risk of soft tissue sarcoma (STS) disease progression. METHODS: We retrospectively enrolled 335 patients with STS (training, validation, and The Cancer Imaging Archive sets, n = 168, n = 123, and n = 44, respectively) who underwent surgical resection. Regions of interest were manually delineated using two MRI sequences. Among 12 machine learning-predicted signatures, the best signature was selected, and its prediction score was inputted into Cox regression analysis to build the radiomics signature. A nomogram was created by combining the radiomics signature with a clinical model constructed using MRI and clinical features. Progression-free survival was analyzed in all patients. We assessed performance and clinical utility of the models with reference to the time-dependent receiver operating characteristic curve, area under the curve, concordance index, integrated Brier score, and decision curve analysis. RESULTS: For the combined features subset, the minimum redundancy maximum relevance-least absolute shrinkage and selection operator regression algorithm + decision tree classifier had the best prediction performance. The radiomics signature based on the optimal machine learning-predicted signature, and built using Cox regression analysis, had greater prognostic capability and lower error than the nomogram and clinical model (concordance index, 0.758 and 0.812; area under the curve, 0.724 and 0.757; integrated Brier score, 0.080 and 0.143, in the validation and The Cancer Imaging Archive sets, respectively). The optimal cutoff was -0.03, and cumulative risk rates were calculated. DATA CONCLUSION: To assess the risk of STS progression, the radiomics signature may have better prognostic power than a nomogram/clinical model.
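The concordance index reported above measures how often, among comparable patient pairs, the patient who progressed earlier was also assigned the higher risk score. A minimal sketch of Harrell's version follows; the follow-up times, event flags and risk scores are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs in which the subject with
    the earlier observed event has the higher predicted risk.
    Ties in risk count as half; censored subjects cannot be the
    earlier member of a pair."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored: time of progression unknown
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical follow-up (months), event flags (1 = progression observed)
# and model risk scores (illustrative only).
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]

c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported values of 0.758 and 0.812 sit between the two.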


Disease Progression , Magnetic Resonance Imaging , Nomograms , Sarcoma , Humans , Sarcoma/diagnostic imaging , Sarcoma/surgery , Sarcoma/pathology , Male , Female , Middle Aged , Retrospective Studies , Magnetic Resonance Imaging/methods , Adult , Aged , Machine Learning , Prognosis , Young Adult , Soft Tissue Neoplasms/diagnostic imaging , Soft Tissue Neoplasms/surgery , Soft Tissue Neoplasms/pathology , ROC Curve , Radiomics
12.
Elife ; 12, 2024 May 09.
Article En | MEDLINE | ID: mdl-38722146

Imputing data is a critical issue for machine learning practitioners, including in the life sciences domain, where missing clinical data is a typical situation and the reliability of the imputation is of great importance. Currently, there is no canonical approach for imputation of clinical data and widely used algorithms introduce variance in the downstream classification. Here we propose novel imputation methods based on determinantal point processes (DPP) that enhance popular techniques such as the multivariate imputation by chained equations and MissForest. Their advantages are twofold: improving the quality of the imputed data demonstrated by increased accuracy of the downstream classification and providing deterministic and reliable imputations that remove the variance from the classification results. We experimentally demonstrate the advantages of our methods by performing extensive imputations on synthetic and real clinical data. We also perform quantum hardware experiments by applying the quantum circuits for DPP sampling since such quantum algorithms provide a computational advantage with respect to classical ones. We demonstrate competitive results with up to 10 qubits for small-scale imputation tasks on a state-of-the-art IBM quantum processor. Our classical and quantum methods improve the effectiveness and robustness of clinical data prediction modeling by providing better and more reliable data imputations. These improvements can add significant value in settings demanding high precision, such as in pharmaceutical drug trials where our approach can provide higher confidence in the predictions made.


Algorithms , Machine Learning , Humans , Data Interpretation, Statistical , Reproducibility of Results
13.
Front Immunol ; 15: 1347139, 2024.
Article En | MEDLINE | ID: mdl-38726016

Background: Autism spectrum disorder (ASD) is a disease characterized by social disorder. Recently, the population affected by ASD has gradually increased around the world. Diagnosis and treatment remain very difficult at present. Methods: The ASD datasets were obtained from the Gene Expression Omnibus database, and the immune-relevant genes were downloaded from a previously published compilation. We then used WGCNA to screen the modules related to ASD and immunity. We also chose the best combination and screened out the core genes using Consensus Machine Learning Driven Signatures (CMLS). Next, we evaluated the genetic correlation between immune cells and ASD using GNOVA, and identified pleiotropic regions between ASD and immune cells using PLACO and CPASSOC. FUMA was used to identify pleiotropic regions, and expression quantitative trait loci (eQTL) analysis was used to determine their expression in different tissues and cells. Finally, we used qPCR to measure the expression level of the core gene. Results: We found a close relationship between neutrophils and ASD, and subsequently, CMLS identified a total of 47 potential candidate genes. Second, GNOVA showed a significant genetic correlation between neutrophils and ASD, and PLACO and CPASSOC identified a total of 14 pleiotropic regions. We annotated these 14 regions and identified a total of 6 potential candidate genes. Through eQTL analysis, we found that the CFLAR gene has a specific expression pattern in neutrophils, suggesting that it may serve as a potential biomarker for ASD and is closely related to its pathogenesis. Conclusions: Our study yields new insights into the molecular and genetic heterogeneity of ASD through a comprehensive bioinformatics analysis. These findings have implications for tailoring personalized ASD therapies.


Autism Spectrum Disorder , Computational Biology , Genetic Predisposition to Disease , Quantitative Trait Loci , Humans , Autism Spectrum Disorder/genetics , Autism Spectrum Disorder/immunology , Computational Biology/methods , Gene Expression Profiling , Gene Regulatory Networks , Machine Learning , Databases, Genetic , Immunogenetics , Neutrophils/immunology , Neutrophils/metabolism , Transcriptome
14.
Front Public Health ; 12: 1347219, 2024.
Article En | MEDLINE | ID: mdl-38726233

Background: Osteoporosis is becoming more common worldwide, imposing a substantial burden on individuals and society. The onset of osteoporosis is subtle, early detection is challenging, and population-wide screening is infeasible. Thus, there is a need to develop a method to identify those at high risk for osteoporosis. Objective: This study aimed to develop a machine learning algorithm to effectively identify people with low bone density, using readily available demographic and blood biochemical data. Methods: Using NHANES 2017-2020 data, participants over 50 years old with complete femoral neck BMD data were selected. This cohort was randomly divided into training (70%) and test (30%) sets. Lasso regression selected variables for inclusion in six machine learning models built on the training data: logistic regression (LR), support vector machine (SVM), gradient boosting machine (GBM), naive Bayes (NB), artificial neural network (ANN) and random forest (RF). NHANES data from the 2013-2014 cycle was used as an external validation set input into the models to verify their generalizability. Model discrimination was assessed via AUC, accuracy, sensitivity, specificity, precision and F1 score. Calibration curves evaluated goodness-of-fit. Decision curves determined clinical utility. The SHAP framework analyzed variable importance. Results: A total of 3,545 participants were included in the internal validation set of this study, of whom 1,870 had normal bone density and 1,675 had low bone density. Lasso regression selected 19 variables. In the test set, AUC was 0.785 (LR), 0.780 (SVM), 0.775 (GBM), 0.729 (NB), 0.771 (ANN), and 0.768 (RF). The LR model had the best discrimination, a better calibration curve fit, and the best clinical net benefit on the decision curve, and it also showed good predictive power in the external validation dataset. The top variables in the LR model were: age, BMI, gender, creatine phosphokinase, total cholesterol and alkaline phosphatase. Conclusion: The machine learning model demonstrated effective classification of low BMD using blood biomarkers. This could aid clinical decision making for osteoporosis prevention and management.


Bone Density , Machine Learning , Osteoporosis , Humans , Female , Middle Aged , Male , Osteoporosis/diagnosis , Aged , Algorithms , Nutrition Surveys , Logistic Models , Support Vector Machine
15.
PLoS One ; 19(5): e0303137, 2024.
Article En | MEDLINE | ID: mdl-38722911

The Asian tiger mosquito, Aedes albopictus, is a significant public health concern owing to its expanding habitat and vector competence. Disease outbreaks attributed to this species have been reported in areas under its invasion, and its northward expansion in Japan has caused concern because of the potential for dengue virus infection in newly populated areas. Accurate prediction of Ae. albopictus distribution is crucial to prevent the spread of the disease. However, limited studies have focused on the prediction of Ae. albopictus distribution in Japan. Herein, we used the random forest model, a machine learning approach, to predict the current and potential future habitat ranges of Ae. albopictus in Japan. The model revealed that these mosquitoes prefer urban areas over forests in Japan on the current map. Under predictions for the future, the species will expand its range to the surrounding areas and eventually reach many areas of northeastern Kanto, Tohoku District, and Hokkaido, with a few variations in different scenarios. However, the affected human population is predicted to decrease owing to the declining birth rate. Anthropogenic and climatic factors contribute to range expansion, and urban size and population have profound impacts. This prediction map can guide responses to the introduction of this species in new areas, advance the spatial knowledge of diseases vectored by it, and mitigate the possible disease burden. To our knowledge, this is the first distribution-modelling prediction for Ae. albopictus with a focus on Japan.


Aedes , Mosquito Vectors , Animals , Aedes/virology , Aedes/physiology , Japan , Mosquito Vectors/virology , Ecosystem , Humans , Animal Distribution , Dengue/transmission , Dengue/epidemiology , Machine Learning , Models, Biological
16.
PLoS One ; 19(5): e0300186, 2024.
Article En | MEDLINE | ID: mdl-38722932

INTRODUCTION: Endometriosis is a chronic disease that affects up to 190 million women and those assigned female at birth and remains largely unresolved in terms of etiology and optimal therapy. It is defined by the presence of endometrium-like tissue outside the uterine cavity and is commonly associated with chronic pelvic pain, infertility, and decreased quality of life. Despite the availability of various screening methods (e.g., biomarkers, genomic analysis, imaging techniques) intended to replace the need for invasive surgery, the time to diagnosis remains in the range of 4 to 11 years. AIMS: This study aims to create a large prospective data bank using the Lucy mobile health application (Lucy app) and analyze patient profiles and structured clinical data. In addition, we will investigate the association of removed or restricted dietary components with quality of life, pain, and central pain sensitization. METHODS: A baseline and a longitudinal questionnaire in the Lucy app collect real-world, self-reported information on symptoms of endometriosis, socio-demographics, mental and physical health, economic factors, nutrition, and other lifestyle factors. A total of 5,000 women with confirmed endometriosis and 5,000 women without diagnosed endometriosis in a control group will be enrolled and followed up for one year. With this information, any connections between recorded symptoms and endometriosis will be analyzed using machine learning. CONCLUSIONS: We aim to develop a phenotypic description of women with endometriosis by linking the collected data with existing registry-based information on endometriosis diagnosis and healthcare utilization, using a big data approach. This may help to achieve earlier detection of endometriosis with pelvic pain and significantly reduce the current diagnostic delay. Additionally, we may identify dietary components that worsen the quality of life and pain in women with endometriosis, upon which we can create real-world data-based nutritional recommendations.


Early Diagnosis , Endometriosis , Machine Learning , Quality of Life , Self Report , Humans , Endometriosis/diagnosis , Female , Adult , Pelvic Pain/diagnosis , Prospective Studies , Mobile Applications
17.
PLoS One ; 19(5): e0300125, 2024.
Article En | MEDLINE | ID: mdl-38722967

With the increasing problem of antimicrobial drug resistance, the search for new antimicrobial agents has become a crucial task in the field of medicine. Antimicrobial peptides, as a class of naturally occurring antimicrobial agents, possess broad-spectrum antimicrobial activity and lower risk of resistance development. However, traditional screening methods for antimicrobial peptides are inefficient, necessitating the development of an efficient screening model. In this study, we aimed to develop an ensemble learning model for the identification of antimicrobial peptides, named E-CLEAP, based on the Multilayer Perceptron Classifier (MLP Classifier). By considering multiple features, including amino acid composition (AAC) and pseudo amino acid composition (PseAAC) of antimicrobial peptides, we aimed to improve the accuracy and generalization ability of the identification process. To validate the superiority of our model, we employed five-fold cross-validation and compared it with other commonly used methods for antimicrobial peptide identification. In the experimental results on an independent test set, E-CLEAP achieved accuracies of 97.33% and 84% for the AAC and PseAAC features, respectively. The results demonstrated that our model outperformed other methods in all evaluation metrics. The findings of this study highlight the potential of the E-CLEAP model in enhancing the efficiency and accuracy of antimicrobial peptide screening, which holds significant implications for drug development, disease treatment, and biotechnology advancement. Future research can further optimize the model by incorporating additional features and information, as well as validating its reliability on larger datasets and in real-world environments. The source code and all datasets are publicly available at https://github.com/Wangsicheng52/E-CLEAP.
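As an illustration of the amino acid composition (AAC) feature mentioned in the abstract above, the following is a minimal sketch, assuming the common definition of AAC as the fraction of each of the 20 standard residues in a peptide. The function name, the residue ordering, and the example peptide are hypothetical and are not taken from the E-CLEAP source code.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue.

    Characters outside the 20 standard residues are ignored.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Hypothetical peptide, used only to show the shape of the output.
    features = amino_acid_composition("GIGKFLHSAKKFGKAFVGEIMNS")
    for aa, fraction in zip(AMINO_ACIDS, features):
        if fraction > 0:
            print(f"{aa}: {fraction:.3f}")
```

A fixed-length vector of this kind can be passed to any standard classifier; PseAAC extends it with sequence-order terms, which this sketch does not cover.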


Antimicrobial Peptides , Antimicrobial Peptides/chemistry , Antimicrobial Peptides/pharmacology , Machine Learning , Anti-Infective Agents/pharmacology , Anti-Infective Agents/chemistry , Amino Acids/chemistry
18.
PLoS One ; 19(5): e0303519, 2024.
Article En | MEDLINE | ID: mdl-38723044

OBJECTIVE: To establish whether a natural language processing technique could identify two common inpatient neurosurgical comorbidities using only text reports of inpatient head imaging. MATERIALS AND METHODS: A training and testing dataset of reports of 979 CT or MRI scans of the brain for patients admitted to the neurosurgery service of a single hospital in June 2021 or to the Emergency Department between July 1 and 8, 2021, was identified. A variety of machine learning and deep learning algorithms utilizing natural language processing were trained on the training set (84% of the total cohort) and tested on the remaining reports. A subset comparison cohort (n = 76) was then assessed to compare output of the best algorithm against real-life inpatient documentation. RESULTS: For "brain compression", a random forest classifier outperformed other candidate algorithms with an accuracy of 0.81 and area under the curve of 0.90 in the testing dataset. For "brain edema", a random forest classifier again outperformed other candidate algorithms with an accuracy of 0.92 and AUC of 0.94 in the testing dataset. In the provider comparison dataset, for "brain compression," the random forest algorithm demonstrated better accuracy (0.76 vs 0.70) and sensitivity (0.73 vs 0.43) than provider documentation. For "brain edema," the algorithm again demonstrated better accuracy (0.92 vs 0.84) and AUC (0.45 vs 0.09) than provider documentation. DISCUSSION: A natural language processing-based machine learning algorithm can reliably and reproducibly identify selected common neurosurgical comorbidities from radiology reports. CONCLUSION: This result may justify the use of machine learning-based decision support to augment provider documentation.
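The accuracy and sensitivity figures reported above follow the usual confusion-matrix definitions. The sketch below shows how the two metrics are computed for a binary label such as "brain edema present"; the function name and the toy labels are hypothetical and do not reproduce the study's data.

```python
def accuracy_and_sensitivity(y_true, y_pred):
    """Compute accuracy and sensitivity (recall) for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    accuracy = (tp + tn) / len(pairs)
    positives = tp + fn
    # Sensitivity is undefined when there are no positive cases.
    sensitivity = tp / positives if positives else float("nan")
    return accuracy, sensitivity


if __name__ == "__main__":
    # Made-up labels for illustration only.
    truth = [1, 1, 1, 0, 0]
    predicted = [1, 0, 1, 0, 1]
    acc, sens = accuracy_and_sensitivity(truth, predicted)
    print(f"accuracy={acc:.2f}, sensitivity={sens:.2f}")
```

Sensitivity matters in this comparison because a comorbidity that goes undocumented counts as a false negative, which accuracy alone can hide when positives are rare.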


Comorbidity , Natural Language Processing , Humans , Algorithms , Inpatients/statistics & numerical data , Female , Male , Machine Learning , Magnetic Resonance Imaging/methods , Documentation , Middle Aged , Tomography, X-Ray Computed , Neurosurgical Procedures , Aged , Deep Learning
19.
PLoS One ; 19(5): e0303199, 2024.
Article En | MEDLINE | ID: mdl-38723048

This paper presents an optimized preparation process for an external ointment using the Definitive Screening Design (DSD) method. The ointment, known for its proven clinical efficacy, is a Traditional Chinese Medicine (TCM) formula developed by Professor WYH, a renowned TCM practitioner in Jiangsu Province, China. In this study, a stepwise regression model was employed to analyze the relationship between key process factors (such as mixing speed and time) and rheological parameters. Machine learning techniques, including Monte Carlo simulation, decision tree analysis, and Gaussian process, were used for parameter optimization. Through rigorous experimentation and verification, we have successfully identified the optimal preparation process for WYH ointment. The optimized parameters included a drug ratio of 24.5%, mixing time of 8 min, mixing speed of 1175 rpm, petroleum dosage of 79 g, and liquid paraffin dosage of 6.7 g. The final ointment formulation was prepared using method B. This research not only contributes to the optimization of the WYH ointment preparation process but also provides valuable insights and practical guidance for designing the preparation processes of other TCM ointments. This advanced DSD method enhances the screening approach for identifying the best preparation process, thereby improving the scientific rigor and quality of TCM ointment preparation processes.


Machine Learning , Ointments , Rheology , Drugs, Chinese Herbal/chemistry , Drugs, Chinese Herbal/administration & dosage , Medicine, Chinese Traditional , Drug Compounding/methods , Sodium Dodecyl Sulfate/chemistry , Monte Carlo Method
20.
Protein Sci ; 33(6): e5007, 2024 Jun.
Article En | MEDLINE | ID: mdl-38723187

The identification of an effective inhibitor is an important starting step in drug development. Unfortunately, many issues such as the characterization of protein binding sites, the screening library, materials for assays, etc., make drug screening a difficult proposition. As the size of screening libraries increases, more resources will be inefficiently consumed. Thus, new strategies are needed to preprocess and focus a screening library towards a targeted protein. Herein, we report an ensemble machine learning (ML) model to generate a CDK8-focused screening library. The ensemble model consists of six different algorithms optimized for CDK8 inhibitor classification. The models were trained using a CDK8-specific fragment library along with molecules containing CDK8 activity. The optimized ensemble model processed a commercial library containing 1.6 million molecules. This resulted in a CDK8-focused screening library containing 1,672 molecules, a reduction of more than 99.90%. The CDK8-focused library was then subjected to molecular docking, and 25 candidate compounds were selected. Enzymatic assays confirmed six CDK8 inhibitors, with one compound producing an IC50 value of ≤100 nM. Analysis of the ensemble ML model reveals the role of the CDK8 fragment library during training. Structural analysis of molecules reveals the hit compounds to be structurally novel CDK8 inhibitors. Together, the results highlight a pipeline for curating a focused library for a specific protein target, such as CDK8.


Cyclin-Dependent Kinase 8 , Machine Learning , Molecular Docking Simulation , Protein Kinase Inhibitors , Cyclin-Dependent Kinase 8/antagonists & inhibitors , Cyclin-Dependent Kinase 8/chemistry , Cyclin-Dependent Kinase 8/metabolism , Protein Kinase Inhibitors/chemistry , Protein Kinase Inhibitors/pharmacology , Humans , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology , Drug Evaluation, Preclinical/methods
...